Online gradient descent

Like (offline) gradient descent, but instead of a single $f$ we use a sequence of functions $f_i$ , $i \in \{1,…,T\}$ , revealed one per round

$\mathbf{x}^* = \arg \min_{\mathbf{x}} \sum_{i=1}^{T} f_i(\mathbf{x})$ (the offline optimum)

Assume:

$f_1,…,f_T$ are all convex
Each $f_i$ is $G$-Lipschitz: for all $\mathbf{x}$ , $i$ , $||\nabla f_i(\mathbf{x})||_2 \leq G$
Starting radius: $||\mathbf{x}^{*}-\mathbf{x}^{(1)}||_2 \leq R$

Online Gradient descent:

Choose $\mathbf{x}^{(1)}$ and $\eta = \frac{R}{G \sqrt{T}}$ .
For $i=1,…,T$ :
- Play $\mathbf{x}^{(i)}$
- Observe $f_i$ and incur cost $f_i(\mathbf{x}^{(i)})$
- $\mathbf{x}^{(i+1)}=\mathbf{x}^{(i)}-\eta \nabla f_i(\mathbf{x}^{(i)})$
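
The loop above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `online_gradient_descent`, the `grad_fns` argument (one gradient callable per round), and the absolute-value losses in the usage part are all assumptions made for the example.

```python
import numpy as np

def online_gradient_descent(grad_fns, x1, R, G):
    """Run online gradient descent and return the points x^(1), ..., x^(T) that were played.

    grad_fns: list of T callables; grad_fns[i](x) returns a (sub)gradient of f_i at x.
    x1: starting point x^(1); R: starting radius; G: Lipschitz constant.
    """
    T = len(grad_fns)
    eta = R / (G * np.sqrt(T))       # step size eta = R / (G sqrt(T))
    x = np.asarray(x1, dtype=float)
    played = []
    for grad in grad_fns:            # rounds i = 1, ..., T
        played.append(x.copy())      # play x^(i)
        x = x - eta * grad(x)        # x^(i+1) = x^(i) - eta * grad f_i(x^(i))
    return played

# Hypothetical losses for illustration: f_i(x) = |x - a_i| in one dimension.
# Each is convex and 1-Lipschitz, so G = 1; np.sign gives a valid subgradient.
targets = np.linspace(-1.0, 1.0, 101)
grad_fns = [lambda x, a=a: np.sign(x - a) for a in targets]

x_star = np.median(targets)          # minimizer of sum_i |x - a_i|
x1 = 0.5
R, G = abs(x_star - x1), 1.0

played = online_gradient_descent(grad_fns, x1, R, G)

# Total cost incurred online minus the cost of the offline optimum.
online_cost = sum(abs(x - a) for x, a in zip(played, targets))
offline_cost = sum(abs(x_star - a) for a in targets)
regret = online_cost - offline_cost
```

Here all $f_i$ are known up front only to keep the example short; in the online setting the gradient for round $i$ would become available only after $\mathbf{x}^{(i)}$ has been played.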

Online gradient descent analysis

Online gradient descent regret bound
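
This section is still a stub. For orientation, the standard guarantee under the assumptions listed above (convex $f_i$ , $G$-Lipschitz, starting radius $R$ , step size $\eta = \frac{R}{G \sqrt{T}}$ ) is usually stated as follows. It is written here from the textbook analysis, not taken from these notes:

$$
\sum_{i=1}^{T} f_i(\mathbf{x}^{(i)}) - \sum_{i=1}^{T} f_i(\mathbf{x}^{*}) \leq \frac{R^2}{2\eta} + \frac{\eta G^2 T}{2} = R G \sqrt{T}
$$

The left-hand side is the regret: the total cost incurred online minus the total cost of the offline optimum $\mathbf{x}^{*}$ .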

#incomplete